Abstract

Background: Rabbit haemorrhagic disease virus (RHDV), the pathogen of rabbit haemorrhagic disease, causes a
highly infectious and often fatal disease that affects only wild and domestic rabbits. Recent research has
revealed that RHDV, as a member of the Caliciviridae, has distinctive features in its genome, its reproduction and so on.

Results: In this report, we analyzed for the first time the RHDV genome and its two open reading frames (ORFs) with
respect to codon usage bias. Our results indicate that mutation pressure rather than natural selection is the most
important determinant of the high codon usage bias in RHDV, and that the codon usage bias of ORF1 is nearly opposite
to that of ORF2, which may be one of the factors regulating the expression of VP60 (encoded by ORF1) and VP10
(encoded by ORF2). Furthermore, negative selective constraints on the whole RHDV genome imply that VP10 plays an
important role in the RHDV lifecycle.

Conclusions: We conjecture that VP10 might benefit the replication, the release, or both, of the virus by
inducing the apoptosis of infected cells initiated by RHDV. Based on the principal component analysis of RSCU values
for ORF2, we separated the 30 RHDV isolates into two genotypes for the first time, and the ENC values indicated that
ORF1 and ORF2 have evolved independently in RHDV.
1. Background

Synonymous codons are not used randomly [1]. The variation of codon usage among ORFs in different
organisms is accounted for by two main factors, mutational pressure and translational selection [2,3].
The levels and causes of codon usage bias help in understanding viral evolution and the interplay between
viruses and the immune response [4]. Codon usage bias and nucleotide composition have therefore been
studied in great detail in many organisms, such as bacteria, yeast, Drosophila and mammals [5]. However,
comparable studies of viruses, especially animal viruses, are scarcer. It has been observed that codon
usage bias in human RNA viruses is related to mutational pressure, G+C content, the segmented nature of
the genome and the route of transmission of the virus [6]. For some vertebrate DNA viruses, genome-wide
mutational pressure is regarded as the main determinant of codon usage rather than natural selection for
specific coding triplets [4]. Analysis of the bovine papillomavirus type 1 (BPV1) late genes has revealed
a relationship between codon usage and tRNA availability [7]. In the mammalian papillomaviruses, it has
been proposed that differences from the average codon usage frequencies in the host genome strongly
influence both viral replication and gene expression [8]. Codon usage may play a key role in regulating
latent versus productive infection in Epstein-Barr virus [9]. Recently, it was reported that codon usage
is an important driving force in the evolution of astroviruses and small DNA viruses [10,11]. Clearly,
studies of synonymous codon usage in viruses can reveal much about the molecular evolution of viruses or
individual genes. Such information would be relevant in understanding the regulation of viral gene
expression.

* Correspondence: liujixing@hotmail.com
† Contributed equally

State Key Laboratory of Veterinary Etiological Biology, Key Laboratory of
Grazing Animal Diseases of Ministry of Agriculture, Key Laboratory of Animal
Virology of Ministry of Agriculture, State Key Laboratory of Veterinary
Etiological Biology, Lanzhou Veterinary Research Institute, Chinese Academy
of Agricultural Sciences, Xujia ping 1, Yanchang bu, Lanzhou, Gansu, Post
Code 730046, China

Up to now, little codon usage analysis has been performed on rabbit haemorrhagic disease virus (RHDV),
the pathogen that causes rabbit haemorrhagic disease (RHD), also known as rabbit calicivirus disease ...

Table 1 Information of RHDV genomes

... and GenBank accession numbers are listed in Table 1.



2.2 The relative synonymous codon usage (RSCU) in 
RHDV 

To investigate the characteristics of synonymous codon usage without the influence of amino acid
composition, the RSCU value of each codon in each ORF of RHDV was calculated according to a previous
report (Sharp, Tuohy et al. 1986) using the following formula:

$$\mathrm{RSCU}_{ij} = \frac{g_{ij}}{\sum_{j=1}^{n_i} g_{ij}} \times n_i$$

where $g_{ij}$ is the observed number of the jth codon for the ith amino acid, which has $n_i$ kinds of
synonymous codons. Codons with an RSCU value higher than 1.0 show positive codon usage bias, while codons
with a value lower than 1.0 show negative codon usage bias. An RSCU value close to 1.0 means that the
codon is chosen equally and randomly among its synonymous alternatives.
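The RSCU definition above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `rscu`, the compact encoding of the standard genetic code, and the convention of returning 0 for every codon of an amino acid that does not occur in the sequence are assumptions made for illustration.

```python
from collections import Counter

# Standard genetic code in RNA letters; '*' marks the termination codons.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def rscu(sequence):
    """Return {codon: RSCU} for the 59 codons that have synonymous alternatives."""
    seq = sequence.upper().replace("T", "U")
    usable = len(seq) - len(seq) % 3          # ignore a trailing partial codon
    codons = [seq[p:p + 3] for p in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group synonymous codons by amino acid, skipping Met, Trp and stop codons.
    families = {}
    for codon, aa in CODON_TABLE.items():
        if aa not in "MW*":
            families.setdefault(aa, []).append(codon)

    values = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        for c in family:
            # Observed count divided by the count expected under equal usage.
            # Assumption: an absent amino acid yields 0 for all of its codons.
            values[c] = counts[c] * len(family) / total if total else 0.0
    return values


example = rscu("GCTGCTGCTGCA")   # four Ala codons: GCU x3, GCA x1
```

By construction, the RSCU values of one amino acid sum to its number of synonymous codons whenever that amino acid is present, which is a convenient sanity check on any RSCU table.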



2.3 The content of each nucleotide and of G+C at the
synonymous third codon position (GC3s)

The index GC3s is the fraction of G+C nucleotides at the synonymous third codon position, excluding
Met, Trp and the termination codons.
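As a rough illustration of the GC3s definition, the following Python sketch counts G and C at the third position of every codon other than Met, Trp and the termination codons. The function name `gc3s` is hypothetical, and the sketch assumes an in-frame sequence containing only unambiguous bases.

```python
# Codons without synonymous alternatives (Met, Trp) and the three stop codons.
EXCLUDED = {"AUG", "UGG", "UAA", "UAG", "UGA"}


def gc3s(sequence):
    """Fraction of G+C at the third position of synonymous codons."""
    seq = sequence.upper().replace("T", "U")
    usable = len(seq) - len(seq) % 3          # ignore a trailing partial codon
    thirds = [seq[p + 2] for p in range(0, usable, 3)
              if seq[p:p + 3] not in EXCLUDED]
    if not thirds:
        raise ValueError("sequence contains no synonymous codons")
    return sum(base in "GC" for base in thirds) / len(thirds)


# GCU, GCC, AUG (skipped), GCG -> third bases U, C, G
value = gc3s("GCUGCCAUGGCG")
```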

2.4 The effective number of codons (ENC) 

The ENC, as the best estimator of absolute synonymous codon usage bias [16], was calculated to quantify
the codon usage bias of each ORF [17]. The predicted values of ENC were calculated as

$$\mathrm{ENC} = 2 + s + \frac{29}{s^{2} + (1 - s)^{2}}$$

where s represents the given (G+C)3% value, expressed as a fraction between 0 and 1. The values of ENC
can also be obtained with the EMBOSS CHIPS program [18].
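The predicted ENC curve is simple to evaluate directly. The short Python sketch below assumes that s is supplied as a fraction (for example 0.38 rather than 38%); the function name `expected_enc` is an illustrative choice, not part of EMBOSS or any other tool mentioned here.

```python
def expected_enc(s):
    """Predicted ENC when codon usage is constrained only by GC3s (s in [0, 1])."""
    if not 0.0 <= s <= 1.0:
        raise ValueError("s must be a fraction between 0 and 1")
    return 2.0 + s + 29.0 / (s ** 2 + (1.0 - s) ** 2)


# The curve is highest at s = 0.5 and falls off towards both extremes.
curve = [(s / 10.0, expected_enc(s / 10.0)) for s in range(11)]
```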
2.5 dN and dS of the two ORFs

Analyses were conducted with the Nei-Gojobori model [19] and involved 30 nucleotide sequences. All
positions containing gaps and missing data were eliminated. The values of dN, dS and ω (dN/dS) were
calculated in MEGA4.0 [20].

2.6 Correspondence analysis (COA) 

Multivariate statistical analysis can be used to explore the relationships between variables and
samples. In this study, correspondence analysis was used to investigate the major trends in codon usage
variation among ORFs. The complete coding region of each ORF was represented as a 59-dimensional vector,
in which each dimension corresponds to the RSCU value of one sense codon (excluding Met, Trp and the
termination codons) [21].

2.7 Correlation analysis 

Correlation analysis was used to identify the relationship between nucleotide composition and synonymous
codon usage pattern [22]. This analysis was based on Spearman's rank correlation.

All statistical analyses were carried out with the statistical software SPSS 17.0 for Windows.
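The study used SPSS for these calculations. For readers without SPSS, Spearman's rank correlation coefficient can be sketched in plain Python as the Pearson correlation of the ranks, with tied values given their mean rank. The helper names below are hypothetical, and the sketch returns only the coefficient r, not a p value.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0   # positions are 0-based
        for i in order[start:end + 1]:
            ranks[i] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0.0 or sd_y == 0.0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))
```

In this study the inputs would be paired per-isolate values, such as U% and U3% for the 30 isolates.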

3. Results 

3.1 Measures of relative synonymous codon usage 

The nucleotide contents of the complete coding regions of all 30 RHDV genomes were analyzed and are
listed in Table 2 and Table 3. The (C+G)% content of ORF1 fluctuated from 50.889 to 51.557 with a
mean value of 51.14557, and the (C+G)% content of ORF2 ranged from 35.593 to 40.113 with a mean
value of 37.6624, indicating that, in contrast to ORF1, ORF2 is composed mainly of the nucleotides
A and U. Comparing the values of A3%, U3%, C3% and G3%, it is clear that C3% was distinctly high and
A3% was the lowest in ORF1 of RHDV, while U3% was distinctly high and C3% was the lowest in ORF2 of


Table 2 Identified nucleotide contents in complete coding region (length > 250 bps) in the ORF1 of RHDV (30 isolates) genome

SN  A%  A3%  U%  ...  C%  ...  G%  ...
...  23.340  ...  24.691  ...

Table 3 Identified nucleotide contents in complete coding region (length > 250 bps) in the ORF2 of RHDV (30 isolates) genome

SN  A%      A3%     U%      U3%     C%      C3%     G%      G3%     (C+G)%  (C3+G3)%  ENC
1   29.944  17.797  30.791  44.068  13.842  16.102  25.424  22.034  39.266  38.136    49.377
2   29.944  18.644  30.226  43.220  14.407  16.949  25.424  21.186  39.831  38.135    48.182
3   31.356  20.339  31.638  46.610  12.994  13.559  24.011  19.492  37.005  33.051    44.567
4   30.508  18.644  30.791  44.915  13.842  15.254  24.859  21.186  38.701  36.440    46.686
5   29.944  17.797  31.921  46.610  12.712  13.559  25.424  22.034  38.136  35.593    41.215
6   30.226  16.949  30.226  43.220  14.407  16.949  25.141  22.881  39.548  39.830    51.964
7   31.356  19.492  30.791  45.763  14.124  15.254  23.729  19.492  37.853  34.764    45.757
8   30.226  16.949  29.661  43.220  15.254  17.797  24.859  22.034  40.113  39.831    47.242
9   30.508  18.644  31.356  45.763  13.277  14.407  24.859  21.186  38.136  35.593    43.017
10  31.356  20.339  31.638  46.610  12.994  13.559  24.011  19.492  37.005  33.051    44.576
11  29.782  17.518  33.898  48.175  12.107  13.139  24.213  21.168  36.320  34.307    43.088
12  31.638  21.186  31.073  45.763  12.994  13.559  24.294  19.492  37.288  33.051    44.997
13  31.073  18.644  31.638  46.610  13.277  14.407  24.011  20.339  37.288  34.746    43.213
14  31.638  19.492  31.921  47.458  12.994  13.559  23.446  19.492  36.440  33.051    47.214
15  31.921  20.339  31.921  46.610  12.712  13.559  23.446  19.492  36.158  33.051    41.964
16  30.226  18.644  30.508  43.220  14.124  16.949  25.141  21.186  39.265  38.135    47.603
17  30.508  19.492  30.508  43.220  13.559  15.254  25.424  22.034  38.983  37.288    47.615
18  29.096  16.102  31.356  45.763  13.277  14.407  26.271  23.729  39.548  38.136    44.343
19  30.226  19.492  31.073  44.915  13.559  15.254  25.141  20.339  38.700  35.593    46.768
20  31.638  19.492  32.768  49.153  11.864  11.017  23.729  20.339  35.593  31.356    39.771
21  31.638  19.492  32.768  49.153  11.864  11.017  23.729  20.339  35.593  31.356    39.771
22  31.073  19.492  31.356  45.763  12.994  13.559  24.576  21.186  37.570  34.745    43.282
23  31.356  19.492  31.921  47.458  12.994  13.559  23.729  19.492  36.723  33.051    42.633
24  31.638  20.339  31.921  47.458  12.994  13.559  23.446  18.644  36.440  32.203    42.157
25  31.638  20.339  32.203  48.305  12.712  12.712  23.446  18.644  36.185  31.356    40.006
26  31.638  20.339  32.203  48.305  12.712  12.712  23.446  18.644  36.185  31.356    40.006
27  30.226  17.797  31.073  44.915  13.559  15.254  25.141  22.034  38.700  37.288    42.799
28  31.356  18.644  31.356  45.763  13.559  15.254  23.729  20.339  37.288  35.593    45.413
29  31.638  21.186  31.638  46.610  12.712  12.712  24.011  19.492  36.723  32.204    43.618
30  31.638  21.186  31.073  45.763  12.994  13.559  24.294  19.492  37.288  32.721    44.997

RHDV. The (C3+G3)% of ORF1 fluctuated from 57.014 to 58.977 with a mean value of 57.68287, and the
(C3+G3)% of ORF2 ranged from 31.356 to 39.831 with a mean value of 34.8337. The ENC values of ORF1
fluctuated from 54.192 to 55.491 with a mean value of 54.95, whereas the ENC values of ORF2 displayed
a wide distribution from 39.771 to 51.964 with a mean value of 44.46. Because higher ENC values
correspond to weaker codon usage bias, the relatively high ENC values of ORF1 indicate only a limited
extent of codon preference in ORF1, whereas the lower ENC values of ORF2 indicate a stronger codon
usage bias in ORF2. The overall relative synonymous codon usage (RSCU) values of the 59 codons for
each ORF in the 30 RHDV genomes are listed in Table 4. The most preferentially used codons in ORF1
were C-ended or G-ended codons, except for Ala, Pro and Ser, whereas A-ended or G-ended codons were
preferred in ORF2.

In addition, the dN, dS and ω (dN/dS) values of ORF1 were 0.014, 0.338 and 0.041, respectively, and
those of ORF2 were 0.034, 0.103 and 0.034, respectively. The ω values of the two ORFs in the RHDV
genome are generally low, indicating that the whole RHDV genome is subject to relatively strong
selective constraints.
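Since ω is defined as the ratio dN/dS, the reported triples can be checked with a few lines of arithmetic. The Python sketch below only recomputes that ratio from the printed dN and dS values; the function name `omega` is illustrative. For ORF1 the recomputed ratio agrees with the printed ω, whereas for ORF2 the printed dN and dS give a ratio of about 0.330 rather than the printed 0.034, so one of the three printed ORF2 values may contain a typographical error. The sketch does not attempt to decide which one, and both ratios remain well below 1.

```python
def omega(dn, ds):
    """Ratio of nonsynonymous to synonymous substitution rates (dN/dS)."""
    if ds <= 0:
        raise ValueError("dS must be positive")
    return dn / ds


# (dN, dS, printed omega) as reported in the text for each ORF.
reported = {"ORF1": (0.014, 0.338, 0.041), "ORF2": (0.034, 0.103, 0.034)}

for name, (dn, ds, printed) in reported.items():
    print(name, "recomputed:", round(omega(dn, ds), 3), "printed:", printed)
```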

3.2 Correspondence analysis 

COA was used to investigate the major trends in codon usage variation between the two ORFs of all 30
RHDV isolates selected for this study. COA of the RHDV genomes detected one major trend along the
first axis (f1), which accounted for 42.967% of the total variation, and another major trend along the
second axis (f2), which accounted for 3.632% of the total variation. The coordinates of the complete
coding region of each ORF were plotted in Figure 1, as defined by the first and second principal axes.
It is clear that the coordinates of each ORF are relatively isolated. Interestingly, we found that the
relatively isolated spots from ORF2 tend to cluster into two groups: the ordinate value of one group
(marked as Group 1) is

Table 4 Synonymous codon usage of the whole coding sequence in RHDV



AA a  Codon  RSCU in ORF1  RSCU in ORF2    AA a  Codon  RSCU in ORF1  RSCU in ORF2
Ala   GCA    1.238761      0.877698        Leu   CUA    0.582651      0.410596
      GCC    1.224431      1.165468              CUC    1.349825      0.397351
      GCG    0.567437      0.014388              CUG    1.188367      0.900662
      GCU    0.969371      1.942446              CUU    1.107137      0.821192
Arg   AGA    1.266604      1.481013              UUA    0.498412      1.350993
      AGG    2.026193      3.341772              UUG    1.273609      2.119205
      CGA    0.303087      0               Lys   AAA    0.699282      0.837209
      CGC    0.991581      1.177215              AAG    1.300718      1.162791
      CGG    0.445276      0               Phe   UUC    0.909962      0.360902
      CGU    0.967259      0                     UUU    1.090038      1.639098
Asn   AAC    1.562517      0.140845        Pro   CCA    1.370342      2
      AAU    0.437483      1.859155              CCC    1.204832      0.451613
Asp   GAC    1.576108      0.909091              CCG    0.45541       0
      GAU    0.423892      1.090909              CCU    0.969417      1.548387
Cys   UGC    1.034803      0               Ser   AGC    0.969041      1.567416
      UGU    0.965197      0                     AGU    1.104135      3.370787
Gln   CAA    0.798416      1.651613              UCA    1.437974      0
      CAG    1.201584      0.348387              UCC    1.226239      0.522472
Glu   GAA    0.843523      0.8                   UCG    0.558562      0
      GAG    1.156477      1.2                   UCU    0.704048      0.539326
Gly   GGA    0.669081      0.797508        Ile   AUA    0.574538      0
      GGC    1.262976      0.984424              AUC    1.247451      0.525
      GGG    0.944991      0.398754              AUU    1.17801       2.475
      GGU    1.122952      1.819315        Tyr   UAC    1.285714      0.086022
His   CAC    1.412429      0                     UAU    0.714286      1.913978
      CAU    0.587571      2               Val   GUA    0.316211      0.763077
Thr   ACA    1.212516      0.129032              GUC    1.050408      0.258462
      ACC    1.379635      2                     GUG    1.163066      0.615385
      ACG    0.496292      0                     GUU    1.470315      2.363077
      ACU    0.911557      1.870968


positive and that of the other group (marked as Group 2) is negative. Interestingly, all of the strains
isolated before 2000 belonged to Group 2.

3.3 Correlation analysis 

To estimate whether the evolution of codon usage in the RHDV genome was regulated by mutation pressure
or natural selection, the A%, U%, C%, G% and (C+G)% were compared with A3%, U3%, C3%, G3% and
(C3+G3)%, respectively (Table 5). There is a complex correlation among nucleotide compositions. In
detail, A3%, U3%, C3% and G3% have a significant negative correlation with G%, C%, U% and A%, and a
significant positive correlation with A%, U%, C% and G%, respectively. This suggests that nucleotide
constraint may influence synonymous codon usage patterns. However, A3% is not correlated with U% or
C%, and U3% is not correlated with A% or G%, which does not indicate any peculiarity of synonymous
codon usage. Furthermore, C3% is not correlated with A% or G%, and G3% is not correlated with U% or
C%, indicating that these data do not reflect the true features of synonymous codon usage either.
Therefore, linear regression analysis was implemented to analyze the correlation between synonymous
codon usage bias and nucleotide composition. Details of the correlation analysis between the first two
principal axes (f1 and f2) of each RHDV genome in COA and the nucleotide contents are listed in
Table 6. Surprisingly, only the f2 values are closely related to the A and G contents at the third
codon position, suggesting that the A and G content is a factor influencing the synonymous codon usage
pattern of the RHDV genome. However, the f1 values are not correlated with the nucleotide contents at
the third codon position, which suggests that codon usage patterns in RHDV are probably also
influenced by other factors, such as the secondary structure of the viral genome and constraints
imposed by the host. Nevertheless, compositional constraint is a factor shaping the pattern of
synonymous codon usage in the RHDV genome.
Figure 1 A plot of the values of the first and second axes of the RHDV genomes in COA. The first axis (f1) accounts for 42.967% of the total variation,
and the second axis (f2) accounts for 3.632% of the total variation. Symbols: + ORF2, x ORF1.



Table 5 Summary of correlation analysis between the A, U, C, G contents and A3, U3, C3, G3 contents in all selected samples

          A3%             U3%             C3%             G3%             (C3+G3)%
A%        r = 0.869**     r = -0.340 NS   r = -0.358 NS   r = -0.865**    r = -0.266**
U%        r = -0.436 NS   r = 0.921**     r = -0.902**    r = -0.366 NS   r = -0.652**
C%        r = 0.376 NS    r = -0.919**    r = 0.932**     r = -0.352 NS   r = 0.692**
G%        r = -0.860**    r = -0.377 NS   r = -0.437 NS   r = 0.910**     r = 0.220**
(C+G)%    r = -0.331 NS   r = -0.649**    r = 0.636**     r = 0.399*      r = 0.915**

a The r value in this table is calculated in each correlation analysis.
NS means non-significant (p > 0.05).
* means 0.01 < p < 0.05
** means p < 0.01



Table 6 Summary of correlation analysis between the f1, f2 values and A3, U3, C3, G3, C3+G3 contents in all selected samples

Base compositions   f1 (42.967%)     f2 (3.632%)
A3%                 r = -0.051 NS    r = -0.740**
U3%                 r = 0.243 NS     r = 0.314 NS
C3%                 r = -0.291 NS    r = -0.298 NS
G3%                 r = 0.108 NS     r = 0.723**
(C3+G3)%            r = -0.216 NS    r = 0.205 NS

a The r value in this table is calculated in each correlation analysis.
NS means non-significant.
* means 0.01 < p < 0.05
** means p < 0.01



Tian et al. Virology Journal 2011, 8:494
http://www.virologyj.com/content/8/1/494



4. Discussion 

A growing number of features have been found to be unique to RHDV within the family Caliciviridae,
including its single host tropism, its genome and its VP10, a structural protein of unknown function.
After analyzing synonymous codon usage in RHDV (Table 2), we reached several conclusions and
conjectures, as follows.

4.1 Mutational bias as a main factor leading to 
synonymous codon usage variation 

The ENC-plot, as a general strategy, was used to investigate patterns of synonymous codon usage. In an
ENC-plot, ORFs constrained only by (C3+G3) composition will lie on or just below the curve of the
predicted values [18]. The ENC values of the RHDV genomes were plotted against the corresponding
(C3+G3)%. All of the spots lie below the curve of the predicted values, as shown in Figure 2,
suggesting that the codon usage bias in all 30 RHDV genomes is principally influenced by mutational
bias.
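The comparison behind an ENC-plot can be sketched numerically: for each ORF, the observed ENC is compared with the value predicted from its GC3s by the formula in Section 2.4. The Python sketch below uses three ORF2 isolates (SN 1, 6 and 20) whose (C3+G3)% and ENC values are copied from Table 3 and converted to fractions; the function and variable names are illustrative assumptions, and the sketch reports only the vertical distance to the curve, not its cause.

```python
def expected_enc(s):
    """Predicted ENC for a GC3s fraction s, as in Section 2.4."""
    return 2.0 + s + 29.0 / (s ** 2 + (1.0 - s) ** 2)


# SN -> (GC3s as a fraction, observed ENC), copied from Table 3 (ORF2).
samples = {1: (0.38136, 49.377), 6: (0.39830, 51.964), 20: (0.31356, 39.771)}

# Positive gap: the observed point lies below the predicted curve.
gaps = {sn: expected_enc(s) - observed for sn, (s, observed) in samples.items()}

for sn, gap in sorted(gaps.items()):
    print("isolate", sn, "lies", round(gap, 2), "ENC units below the curve")
```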



4.2 Evidence for codon usage bias as a factor reducing the
expression of VP10

As is known, the efficiency of gene expression is influenced by regulatory sequences or elements and
by codon usage bias. It has been reported that the RNA sequence of the 3'-terminal 84 nucleotides of
ORF1, rather than the encoded peptide, is crucial for VP10 expression. VP10, encoded by ORF2, has been
reported to be a structural protein expressed at a low level compared with VP60, encoded by ORF1 [5],
and its efficiency of translation is only 20% of that of VP60. The results in Table 4 reveal
differences in the codon usage patterns of the two ORFs, which are a possible factor reducing the
expression of VP10.

4.3 Negative selective constraints on the RHDV whole 
genome 

Although VP10, encoded by ORF2, is a minor structural protein of unknown function that has been
described by Liu as nonessential for virus infectivity, the ω



Figure 2 Effective number of codons used in each ORF plotted against the GC3s. The continuous curve plots the relationship between
GC3s and ENC in the absence of selection. All of the spots lie below the expected curve. Symbols: x ORF2, + ORF1.

Table 7 Summary of correlation analysis between ENC value of ORF1 and ENC value of ORF2

                     ENC value of ORF1             ENC value of ORF2
ENC value of ORF1    r = 1, p = 0                  r = 0.230, p = 0.222 > 0.05
ENC value of ORF2    r = 0.230, p = 0.222 > 0.05   r = 1, p = 0




value of ORF2 suggests that VP10 plays an important role at a certain stage of the RHDV lifecycle.
Considering both the low expression and the ω value of VP10, we conjecture that VP10 might benefit the
replication, the release, or both, of the virus by inducing the apoptosis of infected cells initiated
by RHDV. This mechanism has been confirmed in various positive-strand RNA viruses, including
coxsackievirus, dengue virus, equine arterivirus, foot-and-mouth disease virus, hepatitis C virus,
poliovirus, rhinovirus and severe acute respiratory syndrome coronavirus [23-29], although the details
remain elusive.

4.4 Independent evolution of ORF1 and ORF2 

As described above, ENC reflects the evolution of codon usage variation and nucleotide composition to
some degree. In the correlation analysis of the ENC values of ORF1 and ORF2 (Table 7), the correlation
coefficient is 0.230 and the p value is 0.222, which is greater than 0.05. These data reveal no
correlation between the ENC values of the two ORFs, indicating that the codon usage patterns and the
evolution of the two ORFs are independent of each other. Further, this information may help explain
why the RSCU and ENC values of the two ORFs are quite different.

4.5 A possible genotyping basis 

Interestingly, we found that the relatively isolated spots from ORF2 tend to cluster into two groups:
the ordinate value of one group (marked as Group 1) is positive and that of the other group (marked as
Group 2) is negative. All of the strains isolated before 2000 belonged to Group 2, including Italy-90,
RHDV-V351, RHDV-FRG, BS89, RHDV-SD and M67473.1. Although RHDV has been reported to comprise only one
type, this clustering may serve as a reference for dividing it into two genotypes.

5. Conclusion 

In this report, we analyzed for the first time the RHDV genome and its two open reading frames (ORFs)
with respect to codon usage bias. Our results indicate that mutation pressure rather than natural
selection is the most important determinant of the high codon usage bias in RHDV, and that the codon
usage bias of ORF1 is nearly opposite to that of ORF2, which may be one of the factors regulating the
expression of VP60 (encoded by ORF1) and VP10 (encoded by ORF2). Furthermore, negative selective
constraints on the whole RHDV genome imply that VP10 plays an important role in the RHDV lifecycle. We
conjecture that VP10 might benefit the replication, the release, or both, of the virus by inducing the
apoptosis of infected cells initiated by RHDV. Based on the principal component analysis of RSCU
values for ORF2, we separated the 30 RHDV isolates into two genotypes for the first time, and the ENC
values indicated that ORF1 and ORF2 have evolved independently in RHDV. These results will serve as a
reference to guide further research on RHDV.